Advancing polytrauma care: developing and validating machine learning models for early mortality prediction

Background: Rapid identification of high-risk polytrauma patients is crucial for early intervention and improved outcomes. This study aimed to develop and validate machine learning models for predicting 72 h mortality in adult polytrauma patients using readily available clinical parameters.

Methods: A retrospective analysis was conducted on polytrauma patients from the Dryad database and our institution. Missing values for eligible patients in the Dryad database were imputed with the k-nearest neighbor algorithm, and these patients were then randomly split into training and internal validation cohorts at a 7:3 ratio. Patients from our institution served as the external validation cohort. The predictive performance of random forest (RF), neural network, and XGBoost models was assessed with a comprehensive set of performance metrics. SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) were used to explain the best-performing model. Finally, restricted cubic spline analysis and multivariate logistic regression were performed as sensitivity analyses to verify the robustness of the findings.

Results: Parameters including age, body mass index, Glasgow Coma Scale, Injury Severity Score, pH, base excess, and lactate emerged as key predictors of 72 h mortality. The RF model performed best, with an area under the receiver operating characteristic curve (AUROC) of 0.87 (95% confidence interval [CI] 0.84–0.89), an area under the precision-recall curve (AUPRC) of 0.67 (95% CI 0.61–0.73), and an accuracy of 0.83 (95% CI 0.81–0.86) in the internal validation cohort, and an AUROC of 0.98 (95% CI 0.97–0.99), an AUPRC of 0.88 (95% CI 0.83–0.93), and an accuracy of 0.97 (95% CI 0.96–0.98) in the external validation cohort. It also provided the highest net benefit among the models in the decision curve analysis. The results of the sensitivity analyses were consistent with those obtained from SHAP and LIME.

Conclusions: The RF model showed the best performance in predicting 72 h mortality in adult polytrauma patients and has the potential to aid clinicians in identifying high-risk patients and guiding clinical decision-making.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12967-023-04487-8.
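Two of the metrics reported in the abstract, AUROC and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`auroc`, `net_benefit`, `net_benefit_treat_all`) and the toy labels and scores are hypothetical and chosen only for illustration. The sketch assumes the standard definitions, namely AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and net benefit at threshold probability p_t as TP/n − (FP/n) × p_t/(1 − p_t).

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on every case whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy in a decision curve: treat every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = died within 72 h, 0 = survived); not from the study.
toy_labels = [1, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.90, 0.70, 0.40, 0.20, 0.10, 0.35, 0.60, 0.05]

print("AUROC:", auroc(toy_labels, toy_scores))
print("Net benefit (model, p_t = 0.5):", net_benefit(toy_labels, toy_scores, 0.5))
print("Net benefit (treat all, p_t = 0.5):", net_benefit_treat_all(toy_labels, 0.5))
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares each model against the treat-all and treat-none (net benefit 0) strategies. The pairwise AUROC here is quadratic in the number of cases, which is acceptable for a small illustration but not for large cohorts.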


Item Description
Table S1 Baseline characteristics of patients from the external cohort who died or survived within 72 hours.
Figure S1 Cohort and sample selection.
Figure S2 Proportion of missing values for each variable.
Figure S3 Comparisons between raw data and imputed data.
Figure S4 Grid search method to determine hyperparameters of XGBoost models.
Figure S5 Grid search method to determine the best number of rounds for XGBoost models.
Figure S6 Neural network interpretation diagram.
Figure S14 Calibration plots for NN models in the training dataset with 10-fold internal cross-validation repeated 100 times.
Figure S19 Rank score of importance for predictors.
Figure S20 Confusion matrix plots, calibration plots, AUROCs, AUPRCs, and DCAs for models in the external cohort.
Figure S21 AUROCs, AUPRCs, calibration plots, and DCA for random forest models in the internal cohort.
Figure S22 AUROCs, AUPRCs, calibration plots, and DCA for random forest models in the external cohort.

Figure S16 Decision
Figure S17 Decision
Figure S18 Decision

Figure S1. Cohort and sample selection. This flow diagram shows patient inclusion and exclusion criteria in each cohort as well as the dataset partition for training, internal and external validation cohorts. Abbreviations: ISS, injury severity score.

Table S1. Baseline characteristics of patients from the external cohort who died or survived within 72 hours.
† Values are presented as the median (interquartile range). ¶ Values are presented as number (percentage). * P values between groups were assessed by the Chi-square and Mann-Whitney U tests. Abbreviations: BMI, body mass index; ISS, Injury Severity Score; GCS, Glasgow Coma Scale; BE, base excess.